Add TraceQL query hint to retrieve most recent results ordered by trace start time #4238

Merged 19 commits on Feb 12, 2025

Conversation

@joe-elliott (Member) commented Oct 25, 2024

What this PR does

Adds a query hint, with (most_recent=true), that causes Tempo to perform a more thorough search and return the most recent traces ordered by start time. This is currently implemented as a query hint because, depending on the circumstances, it can have a quite large impact on the time to return results on both the streaming gRPC and discrete HTTP paths. A TraceQL hint was chosen over a query param because the longer-term goal is to make this the default behavior, and it will be easier to remove a query hint than a query parameter.
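For example (the span filter below is just an illustration, not taken from this PR), a search using the hint looks like:

```
{ resource.service.name = "api" } with (most_recent=true)
```

With the hint set, Tempo keeps searching past the point where the result limit is first reached and returns the limit's worth of traces with the latest start times.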

This change is implemented at the query frontend, query engine, and ingester search levels. In all cases the search must store the most recent results and continue searching instead of returning immediately once the limit is hit.

Other Changes

Which issue(s) this PR fixes:
This PR will give a way to always return the most recent results ordered by start time which partially addresses #3777, #3109 and #2659.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <[email protected]>
@knylander-grafana (Contributor) left a comment

Thank you for updating the docs! Updates look good.

}
i.headBlockMtx.RUnlock()
if err := anyErr.Load(); err != nil {
	return nil, err
}
if combiner.Count() >= maxResults {
Contributor:

One benefit of this early return was that it didn't have to take the blocks mutex if the search was satisfied by the head block. Is that still possible if the query isn't using the most_recent hint?

Member Author (joe-elliott):

Yup, that check has been moved here. For the anyCombiner the time is ignored and it's just the old limit check.
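A minimal sketch of that distinction, with illustrative names (not the PR's actual types): the plain limit check can stop as soon as enough results are collected, while the most-recent path never treats the limit alone as a reason to stop.

```go
// Sketch only: contrasts the two completion behaviors described above.
// Names are illustrative, not the exact code in this PR.
package main

import "fmt"

// Without the hint (the "anyCombiner" case): once the limit is hit the
// search is satisfied and can return immediately, e.g. without ever
// taking the blocks mutex if the head block alone produced enough results.
func doneWithoutHint(count, limit int) bool {
	return limit > 0 && count >= limit
}

// With most_recent=true the limit is never enough on its own: an
// unsearched block may still hold a more recent trace, so the search
// continues and only the most recent results are kept.
func doneWithMostRecent() bool {
	return false
}

func main() {
	fmt.Println(doneWithoutHint(20, 20)) // true: stop early
	fmt.Println(doneWithMostRecent())    // false: keep searching
}
```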


if c.Count() == c.keepMostRecent && c.keepMostRecent > 0 {
	// if this is older than the oldest element, bail
	if c.OldestTimestampNanos() > new.StartTimeUnixNano {
Contributor:

I have a feeling that this function is correct, but I need help understanding it a bit more. I'm wondering if it should check the new timestamp cutoff before merging with the existing entry? Or, more generally, how does the timestamp sorting work when merging? If we get an older partial snippet of an existing entry, it will overwrite the start time to be older. In that case, should the result be kicked out? If the trace had been fully combined into one snippet (perfect compaction), it seems like it would have been kicked out (properly counted as older than other results).

@joe-elliott (Member Author) commented Nov 7, 2024

Yes, these are good callouts. I wrote this logic thinking about the combiner being used in the engine, but the truth is there are all kinds of weird behaviors with fractured traces when it's used in the query frontend. I don't know if there are good answers for all these edge cases without rethinking things at the storage layer. This may be a feature that is always best effort.

This is weird:

  • Add metadata for trace id 1234 to the most recent list
  • It gets pushed out by more recent traces
  • Receive a new shard for 1234 that re-qualifies it for the most recent list
  • We correctly return 1234 but have lost some information about it

Or this:

  • Receive 10 "most recent" traces and return a result to the client
  • There is a trace shard earlier in the search window which pushes one of the returned results out of the most recent list.
  • We have incorrectly returned a trace that should not have been in the list.

Contributor:

Can you explain why this is the opposite of line 140? Because this is saying that if the oldestTimeStamp is older than the new metadata, then discard the new data.

Contributor:

Adding to the edge case scenarios: what about traces whose root spans have not yet been received and would put them outside of the search window?

Member Author (joe-elliott):

> Can you explain why this is the opposite of line 140? Because this is saying that if the oldestTimeStamp is older than the new metadata, then discard the new data.

Line 140 is checking if it's newer than the oldest timestamp. If it is newer, it converts the spanset to metadata and attempts to add it. Line 159 is checking if it's older than the oldest timestamp; if it's older, it doesn't add it. The first is a check to include the spanset, the second is a check to exclude the spanset.

> Adding to the edge case scenarios: what about traces whose root spans have not yet been received and would put them outside of the search window?

Yeah, I think that's another case (similar to the second one I mentioned) where not-yet-discovered information will push a trace out of the window.

I think an exhaustive search would be required to completely remove all edge cases.
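A minimal sketch of the two checks described in this reply, written as plain functions over nanosecond start times (the flat signatures and names are assumptions for illustration, not the PR's actual combiner API):

```go
// Sketch only: the "include" and "exclude" checks described above,
// expressed over plain values. kept is how many traces are currently
// held, keep is the keepMostRecent limit, oldestKept is the start time
// of the oldest trace still in the list.
package main

import "fmt"

// Include check ("line 140"): a spanset is only worth converting to
// metadata if the list is not yet full or it is newer than the oldest
// trace we are keeping.
func worthConverting(startNanos, oldestKept uint64, kept, keep int) bool {
	if keep == 0 || kept < keep {
		return true
	}
	return startNanos > oldestKept
}

// Exclude check ("line 159"): when adding metadata to a full list, bail
// if it is older than the oldest element already kept.
func shouldDrop(startNanos, oldestKept uint64, kept, keep int) bool {
	return keep > 0 && kept == keep && oldestKept > startNanos
}

func main() {
	// full list of 20, oldest kept trace started at t=100
	fmt.Println(worthConverting(150, 100, 20, 20)) // true: newer, include it
	fmt.Println(shouldDrop(50, 100, 20, 20))       // true: older, exclude it
}
```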

@joe-elliott requested a review from mdisibio on November 7, 2024

github-actions bot commented Jan 7, 2025

This PR has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. This pull request will be closed in 15 days if there is no new activity.
Please apply keepalive label to exempt this Pull Request.

@github-actions bot added the stale label on Jan 7, 2025
@bastischubert added the keepalive label and removed the stale label on Jan 9, 2025
@knylander-grafana added the type/docs label on Jan 15, 2025
@joe-elliott removed the type/docs label on Jan 16, 2025
return s.completedThroughSeconds
}

s.shards = shards
Contributor:

Is there no possibility of the responses coming in out of order?

Member Author (joe-elliott):

There absolutely is. In fact, most of the complexity of this code is due to the fact that responses can and do come in any order. There should be only one shards response, though.

These tests attempt to cover all of the out-of-order cases I could think of:

https://github.com/grafana/tempo/pull/4238/files#diff-01e87bde5e33cf0e55bb439f54ed688f4c57e4b5843c959454a9bf6760fcc97cR663
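A rough sketch of the idea behind the snippet above (the field and method names mirror that snippet; everything else is assumed): track per-shard completion and only advance the completed-through point across a contiguous run of finished shards from the most recent end, so out-of-order completions cannot prematurely finalize results.

```go
// Sketch only: one way to track how far back a time-sharded search is
// complete when shard results arrive in any order. The real tracker in
// this PR differs in its details.
package main

import "fmt"

type shard struct {
	coversDownTo uint32 // oldest unix second this shard covers
	completed    bool
}

type tracker struct {
	// set once by the single "shards" response, ordered newest to oldest
	shards []shard
}

// completedThroughSeconds walks shards from the most recent end and stops
// at the first incomplete one. Traces starting at or after the returned
// time are final and safe to return to the client in start-time order.
func (t *tracker) completedThroughSeconds() uint32 {
	var through uint32
	for _, s := range t.shards {
		if !s.completed {
			break
		}
		through = s.coversDownTo
	}
	return through
}

func main() {
	t := &tracker{shards: []shard{
		{coversDownTo: 300, completed: true},
		{coversDownTo: 200, completed: false}, // finished jobs arrived out of order
		{coversDownTo: 100, completed: true},
	}}
	fmt.Println(t.completedThroughSeconds()) // 300: only the newest shard is finalized
}
```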



@ie-pham (Contributor) commented Feb 11, 2025

I've reviewed as thoroughly as I can and nothing obvious is jumping out at me. Have you done any benchmarks on whether this has any big impact on performance for the regular (non-recent) search with the new implementation? I don't believe much was changed, but that would be great assurance that nothing is broken. Otherwise, I'm good to approve.

@joe-elliott (Member Author) commented Feb 12, 2025

There is minimal impact on the normal search path. It does unnecessarily create, return, and track shards on the normal search path, but testing did not show any performance difference, and I was hesitant to wire up all the complexity necessary to avoid that work. wdyt?

Edit: I added code to skip tracking for searches without the query hint. It's also possible to not create the objects needed for tracking, but that would require a bit more rewiring.

@ie-pham (Contributor) commented Feb 12, 2025

hit it

@joe-elliott (Member Author):

hitting it!
